Abstract

In this paper, we present a method of estimating ref- 
erents of demonstrative pronouns, personal pronouns, 
and zero pronouns in Japanese sentences using examples,
surface expressions, topics and foci. Unlike conventional
work, which used semantic markers for semantic
constraints, we used examples for semantic constraints
and showed in our experiments that examples are as
useful as semantic markers. We also propose many new
methods for estimating referents of pronouns. For exam- 
ple, we use the form "X of Y" for estimating referents of 
demonstrative adjectives. In addition to our new meth- 
ods, we used many conventional methods. As a result, 
experiments using these methods obtained a precision 
rate of 87% in estimating referents of demonstrative pro- 
nouns, personal pronouns, and zero pronouns for training 
sentences, and obtained a precision rate of 78% for test 
sentences. 

1 Overview 

This paper describes how to resolve the referents of pro- 
nouns: demonstrative pronouns, personal pronouns, and 
zero pronouns. Pronoun resolution is especially important
for machine translation. For example, if the system
cannot resolve zero pronouns [1], it cannot translate
sentences containing them from Japanese into English.
When the word order of sentences is changed and the 
pronominalized words are changed in translation into 
English, the system must detect the referents of the pro- 
nouns. 

Condition => {Proposal Proposal ..}
Proposal := (Possible-Antecedent Points)

Figure 1: Form of Candidate enumerating rule

A lot of work has been done on Japanese pronoun resolution
(Kameyama 86) (Yamamura et al. 92) (Walker et al. 94)
(Takada & Doi 94) (Nakaiwa & Ikehara 95).
The main distinguishing features of our work are as fol- 
lows: 

• In conventional pronoun resolution methods, se- 
mantic markers have been used for semantic con- 
straints. On the other hand, we use examples for 
semantic constraints and show in our experiments 
that examples are as useful as semantic markers. 
This is an important result because the cost of con- 
structing the case frame using semantic markers is 
generally higher than the cost of constructing the 
case frame using examples. 

• We use examples in the form "X no Y" (Y of X) for 
estimating referents of demonstrative adjectives. 



Condition => (Points) 

Figure 2: Form of Candidate judging rule 



• We deal with the case when a demonstrative refers 
to elements that appear later. 

• We resolve a personal pronoun in a quotation by 
determining who is the speaker and who is the lis- 
tener. 

In this work, we drew on almost all of the techniques of
conventional methods and also propose new methods.

2 The Framework for Estimating the 
Referent 

Prior to the pronoun resolution process, sentences are
transformed into a case structure by a case structure
analyzer (Kurohashi & Nagao 94). The antecedents of



1 Omitted noun phrases are called zero pronouns. 



pronouns are determined by heuristic rules from left to 
right. Using these rules, our system assigns points to 
possible antecedents, and judges that the one having the 
maximum total score is the desired antecedent. 

Heuristic rules are classified into two kinds: Candi- 
date enumerating rules and Candidate judging rules. Can- 
didate enumerating rules are used in enumerating can- 
didate antecedents and giving them points (which rep- 
resent the plausibility of being the correct antecedent). 
Candidate judging rules are used in giving points to the 
candidate antecedents selected by Candidate enumerating 
rules. These rules are shown in Figures 1 and 2. Surface
expressions, semantic constraints, referential properties,
etc. are written as conditions in the Condition part.
Possible antecedents are written in the Possible-Antecedent
part. Points means the plausibility of the possible
antecedent.

An estimation of the referent is performed using the 
total scores of possible antecedents given by Candidate 
enumerating rules and Candidate judging rules. First, the 
system applies all Candidate enumerating rules to the 
anaphor and enumerates candidate antecedents having 
points. Next, the system applies all Candidate judg- 
ing rules to all the candidate antecedents and sums the 
scores of all the candidate antecedents. Consequently, 



Table 1: The weight in the case of topic

Surface expression            Example                              Weight
Pronoun/zero-pronoun ga/wa    (John ga (subject)) shita (done).    21
Noun wa/niwa                  John wa (subject) shita (do).        20

Table 2: The weight in the case of focus

Surface expression                                      Example                              Weight
Pronoun/zero-pronoun wo (object)/ni (to)/kara (from)    (John ni (to)) shita (done).         16
Noun ga (subject)/mo/da/nara                            John ga (subject) shita (do).        15
Noun wo (object)/ni/,/.                                 John ni (object) shita (do).         14
Noun he (to)/de (in)/kara (from)                        gakkou (school) he (to) iku (go).    13


the system judges the candidate antecedent having the
best score to be the proper antecedent. If several
candidate referents have the best score, the candidate
referent selected first in order [2] is judged to be the correct
antecedent.
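
The two-stage scoring described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `resolve`, the representation of rules as Python callables, and the tie-break by first proposal are all assumptions made for the example.

```python
from collections import defaultdict


def resolve(anaphor, enumerating_rules, judging_rules):
    """Return the candidate antecedent with the highest total score, or None.

    enumerating_rules: callables anaphor -> list of (candidate, points)
    judging_rules:     callables (anaphor, candidate) -> points
    (Both interfaces are hypothetical, chosen only for this sketch.)
    """
    scores = defaultdict(int)  # candidate -> total points
    order = []                 # candidates in the order they were first proposed

    # Step 1: every enumerating rule proposes candidates with points.
    for rule in enumerating_rules:
        for candidate, points in rule(anaphor):
            if candidate not in scores:
                order.append(candidate)
            scores[candidate] += points

    # Step 2: every judging rule adjusts the score of every candidate.
    for rule in judging_rules:
        for candidate in order:
            scores[candidate] += rule(anaphor, candidate)

    if not order:
        return None

    # Step 3: the highest total wins; a tie goes to the earliest proposal
    # (an assumed reading of "selected first in order").
    best = max(scores[c] for c in order)
    return next(c for c in order if scores[c] == best)


# Toy rules with made-up points, for illustration only.
def propose_nouns(anaphor):
    return [("taroo", 18), ("konpyuuta", 18)]


def penalize_human(anaphor, candidate):
    return -10 if candidate == "taroo" else 0


print(resolve("sore", [propose_nouns], [penalize_human]))
```

Keeping the candidates in proposal order makes the tie-break explicit instead of depending on dictionary iteration details.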

We made 50 Candidate enumerating rules and 10 Can- 
didate judging rules for analyzing demonstratives, 4 Can- 
didate enumerating rules and 6 Candidate judging rules for 
analyzing personal pronouns, and 19 Candidate enumer- 
ating rules and 4 Candidate judging rules for analyzing 
zero pronouns. Some of the rules are described in the 
following sections. 

3 Heuristic Rules for Demonstratives 

We made heuristic rules for demonstratives by consulting
the papers (NLRI 81) (Hayashi 83) (Takahashi et al. 90)
(Kinsui & Takubo 92) and by examining Japanese sentences
by hand. Demonstratives have three categories:
demonstrative pronouns, demonstrative adjectives, and
demonstrative adverbs. In the following sections, we
explain the rules for analyzing demonstratives.

Table 3: Points given in the case of demonstrative
pronouns

3.1 Rule for Demonstrative Pronouns 
Rule in the case when the referent is a noun 
phrase 

Candidate enumerating rule 1 

When a pronoun is a demonstrative pronoun or "sono
(of it) / kono (of this) / ano (of that)",
{(A topic which has weight W and distance D,
W - D - 2)
(A focus which has weight W and distance D,
W - D + 4)}

This bracketed expression represents the list of proposals
in Figure 1. The definition and weight W of
the topic and focus are shown in Tables 1 and 2. The
distance (D) is the number of topics and foci between
the demonstrative and the possible referent. Since a
demonstrative more often refers to foci than a zero
pronoun does, we add the coefficient -2 or +4 as compared
with the heuristic rules in zero pronoun resolution.



The score (in other words, the certification value) of a 
candidate referent depends on the weight of topics/foci 
and the physical distance between the demonstrative and 
the candidate referent. 
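
As a small worked illustration of this rule, the points for a topic or focus candidate can be computed from its weight W and distance D. The function name and the string labels "topic" and "focus" are assumptions for the sketch; the weights in the usage lines are the "Noun wa/niwa" and "Noun wo (object)" entries of Tables 1 and 2.

```python
def demonstrative_points(kind, weight, distance):
    """Points proposed for a demonstrative's candidate referent.

    kind:     "topic" or "focus" (labels are hypothetical)
    weight:   W, taken from Table 1 (topics) or Table 2 (foci)
    distance: D, the number of topics and foci between the
              demonstrative and the candidate
    """
    if kind == "topic":
        return weight - distance - 2
    if kind == "focus":
        return weight - distance + 4
    raise ValueError("kind must be 'topic' or 'focus'")


# A "Noun wa" topic (W = 20) with one topic/focus in between:
print(demonstrative_points("topic", 20, 1))
# A "Noun wo" focus (W = 14) immediately before the demonstrative:
print(demonstrative_points("focus", 14, 0))
```

With these inputs the focus outscores the topic even though its weight is lower, which reflects the stated preference of demonstratives for foci.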

Rule when the referent is a verb phrase 
Candidate enumerating rule 2 



Sim.      0    1    2    3    4    5    6   Exact
Points    0    0  -10  -10  -10  -10  -10    -10

Sim.: Similarity level




When a pronoun is "kore/sore/are" or a demonstrative
adjective,

{(The previous sentence (or the verb phrase which is a
conditional form containing a conjunctive particle such
as "ga (but)", "daga (but)", and "keredo (but)" if the
verb phrase is in the same sentence), 15)}

The following is an example of a pronoun referring to 
the verb phrase in the previous sentence. 

tengu-wa maenoban-noyouni utattari odottari shihajimeta. 
(tengu) (the previous night) (sing) (dance) (begin to do) 
(Tengus began singing and dancing just as they had done 
the previous night.) 

ojiisan-wa sore-wo mite, kon'nahuuni utai-hajimeta.
(the old man) (it) (see) (as follows) (begin to sing)
(When the old man saw this, he began to sing as follows.)

(1)

In these sentences, a demonstrative pronoun "sore (it)"
refers to the event "tengutachi-ga utattari odottari
shihajimemashita (tengus began singing and dancing just
as they had done the previous night)" [3].

Rule using the feature that demonstrative 
pronouns usually do not refer to people 

Candidate judging rule 1 

When a pronoun is a demonstrative pronoun and a
candidate referent has a semantic marker HUM (human),
it is given -10. We used the Noun Semantic Marker
Dictionary (Watanabe et al. 92) as a semantic marker
dictionary [4].

Candidate judging rule 2 

When a pronoun is a demonstrative pronoun, a candidate
referent is given the points in Table 3 by using the
highest semantic similarity between the candidate referent
and the codes {5200003010 5201002060 5202001020
5202006115 5241002150 5244002100} in "Bunrui Goi
Hyou (BGH)" (NLRI 64) [5], which signify human beings.



2 The order is based on the order in which the rules are applied.

3 A tengu is a kind of monster.

4 This dictionary includes the semantic categories shown in
Table 4.

5 In BGH, each word has a number called a category number.
In an electronic version of BGH, each word has a 10-digit



Table 4: Modification of category number of "bunrui
goi hyou"

Semantic marker              Original code    Modified code
ANI (animal)                 156              511
HUM (human)                  12[0-4]          52[0-4]
ORG (organization)           12[5-8]          53[5-8]
PLA (plant)                  155              611
PAR (part of living thing)   157              621
NAT (natural)                152              631
PRO (products)               14[0-9]          64[0-9]
LOC (location)               117,125,126      651,652,653
PHE (phenomenon)             150,151          711,712
ACT (action)                 13[3-8]          81[3-8]
MEN (mental)                 130              821
CHA (character)              11[2-58],158     83[2-58],839
REL (relation)               111              841
LIN (linguistic products)    131,132          851,852
Others                       110              861
TIM (time)                   116              a11
QUA (quantity)               119              b11

"125" and "126" are given two category numbers.



When we calculate the semantic similarity, we use the
modified code table in Table 4. The reason for this
modification is that some codes in BGH (NLRI 64) are
not suitable for semantic constraints.
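
One possible way to compute a similarity level between two category codes is sketched below. It assumes the seven-level reading of the 10-digit category number given in the footnote on BGH (five one-digit levels, one two-digit level, one three-digit level) and treats a full match as the "Exact" column of the points tables. The function names and the convention of returning 7 for an exact match are assumptions of this sketch, not details confirmed by the paper.

```python
# Characters per hierarchy level: 5 x 1 + 2 + 3 = 10 characters in total.
LEVEL_WIDTHS = (1, 1, 1, 1, 1, 2, 3)


def similarity_level(code_a, code_b):
    """Number of leading hierarchy levels shared by two 10-character codes.

    Returns 0..7, where 7 means the codes are identical ("Exact").
    Codes are handled as strings because modified codes may contain letters.
    """
    if len(code_a) != 10 or len(code_b) != 10:
        raise ValueError("category codes must have 10 characters")
    level, start = 0, 0
    for width in LEVEL_WIDTHS:
        if code_a[start:start + width] != code_b[start:start + width]:
            break
        level += 1
        start += width
    return level


def highest_similarity(candidate_code, reference_codes):
    """Highest similarity level between a candidate and a set of codes."""
    return max(similarity_level(candidate_code, ref) for ref in reference_codes)


# Two of the codes listed above as signifying human beings:
print(similarity_level("5200003010", "5201002060"))
```

The resulting level would then be used as the column index into a points table such as Table 3.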



Table 5: Points given demonstrative pronouns which
refer to places

Sim.      0    1    2    3    4    5    6   Exact
Points  -10   -5    0    5   10   10   10     10



codes {6563006010 6559005020 9113301090 9113302010
6471001030 6314020130} which signify locations in
BGH (NLRI 64).



"soko (there)" commonly refers to location. For example,
"soko" in the following sentences refers to "baiten
(shop)" which signifies location.

koora-wo kaini baiten-ni hairimashita.
(cola) (buy) (shop) (enter)
(Taroo entered a shop to buy a cola.)

jiroo-wa soko-de guuzen dekuwashimashita.
(Jiroo) (there) (by chance) (meet)
(Jiroo met Taroo there by chance.)

(3)

Rule when "kokode" or "sokode" is used as a
conjunction

Candidate enumerating rule 3

When a pronoun is "kokode" or "sokode",
{(the pronoun is used as a conjunction, 11)}



These rules use the feature that a demonstrative pronoun
rarely refers to people. This reduces the number
of candidates of the referent. For example, we find that
"sore (it)" in the following sentences refers to "konpyuuta
(computer)", because "sore (it)" can refer only to
a thing which is not human, and the only noun which is
near "sore (it)" and which is not human is "konpyuuta
(computer)".

taroo-wa saishin-no konpyuuta-wo kaimashita. 
(Taroo) (new) (computer) (buy) 

(Taroo bought a new computer.) 

jon-ni sassoku sore-wo misemashita.
(John) (at once) (it) (show) 
([He] showed it at once to John.) 

(2) 

Rule using the feature that "koko" and "soko"
often refer to locations
Candidate judging rule 3

When a pronoun is "koko (here) / soko (there) / asoko
(over there)" and a candidate referent has a semantic
marker LOC (location), the candidate referent is given
10 points.

Candidate judging rule 4 

When a pronoun is "koko/soko/asoko", a candidate referent
is given the points in Table 5 based on the semantic
similarity between the candidate referent and the

category number. This 10-digit category number indicates 
seven levels of an is-a hierarchy. The top five levels are ex- 
pressed by the first five digits of a category number. The 
sixth level is expressed by the following two digits of a cat- 
egory number. The last level is expressed by the last three 
digits of a category number. 



This rule is for when "kokode (here or then)" or
"sokode (there or then)" is used as a conjunction. If
a word that signifies location is not found near "kokode"
or "sokode", the candidate listed by this rule has the
highest score, and "kokode" or "sokode" is judged to be
a conjunction. By using this rule, "sokode" in the
following sentences is judged to be a conjunction.

ojiisan-wa tengu-ga kowakunakunatte-imashita. 
(old man) (tengu) (lose all fear of) 
(The old man lost all fear of the tengus.) 

sokode ojiisan-wa kakureteita ana-kara detekimashita. 
(so) (old man) (be hiding) (hole) (leave) 
(So, he left the hole where he had been hiding.)

(4) 

This rule is necessary when the system translates
"sokode" into English: the system must judge whether the
word is used as a demonstrative or as a conjunction in
order to translate it into "there" or "then."

Rule when an anaphor does not have its 
antecedent 

Candidate enumerating rule 5 

When a pronoun is a demonstrative pronoun, a demon- 
strative adverb, or a demonstrative adjective, 
{(Introduce an individual, 10)} 

This rule is used when there is no referent of a pro- 
noun in the sentences. This rule makes the system in- 
troduce a certain individual. 

3.2 Rule for Demonstrative Adjectives 

Demonstrative adjectives such as "kono (this)", "sono
(the)", "ano (that)", "kon'na (like this)", and "son'na
(like it)" are classified into two reference categories:
gentei-reference and daikou-reference.



In a Gentei-reference although a demonstrative adjec- 
tive does not refer to an entity by itself, the phrase of 
"demonstrative adjective + noun phrase" refers to the 
antecedent. For example "kono ojiisan (this old man)" 
in the following sentences: 



Table 6: Points given to so-series demonstrative
adjective

Sim.      0    1    2    3    4    5    6   Exact
Points  -10   -2   -1    0    1    2    3      4



ojiisan-wa tengutachi-no-maeni deteitte odori-hajimemashita 
(old man) (before the tengus) (appear) (begin to dance) 
(He appeared before the tengus, and began to dance.) 

keredomo kono ojiisan-wa uta-mo odori-mo hetakuso-deshita 
(but) (this old man) (sing) (dance) (poor) 
(But the old man was a poor singer, and his dancing was 
no better.) 

(5) 

In this example, although the demonstrative "kono 
(this)" does not refer to "ojiisan (old man)" in the first 
sentence, the noun phrase "kono ojiisan (this old man)" 
refers to "ojiisan (old man)" in the first sentence. 

Daikou-reference is the case in which a demonstrative
adjective itself refers to an entity. In this case, we can
analyze "sono (the)" in the same way as "sore-no (of it)".
In the following sentences, "sono" refers to "tengu"
(tengus). It is an example of daikou-reference.



mata karasu-no-youna kao-wo-shita tengu-mo imashita 
(also) (like crows) (with face) (tengu) (exist) 

(There were also some tengus with faces like those of crows.) 

sono kuchi-wa torino-kuchibashi-noyouni togatte-imashita
(their mouths) (like the beaks of birds) (be pointed)
(Their mouths were pointed like the beaks of birds.)

(6) 

Rules for gentei-reference and daikou-reference are as
follows:



Table 7: Examples of the form "the mouth of Noun X"

Examples of Noun X

hukuro (sack), ruporaita (documentary writer), iin (member),
akachan (baby), kare (he)



the subordinate of noun phrase a. 

ojiisan-wa toonoiteiku tsuru-no sugata-wo miokurimashita. 
(old man) (recede) (crane) (figure) (watch) 
(The old man watched the receding figure of the crane.)

"ano tori-wo tasukete yokatta" to iimashita.
(that bird) (save) (glad) (say)
("I'm glad I saved that bird," said the old man to himself.)

(7) 

In this example, the underlined "ano tori (that bird)" 
refers to a subordinate "tsuru (crane)" in the previous 
sentence. 

Rules for daikou-reference of so-series 
demonstrative adjective 

Candidate judging rule 5 

When a pronoun is a so-series demonstrative adjective,
the system consults examples of the form "noun X no
noun Y" whose noun Y is modified by the pronoun,
and gives a candidate referent the points in Table 6
according to the similarity between the candidate referent
and noun X in "Bunrui Goi Hyou" (NLRI 64).



The Japanese Co-occurrence Dictionary (EDR 95c) is 
used as a source of examples of "X no Y" . 



Candidate enumerating rule 7 

When a pronoun is "demonstrative adjective + noun 
a," 

{ (the noun phrase containing a noun a, 45) 

(the topic which is a subordinate of noun a and which 

has weight W and distance D, W — D + 30) 

(the focus which is a subordinate of noun a and which 

has weight W and distance D, W - D + 30)} 



The relationships between a super-ordinate word and
a subordinate word are detected by judging the last word
in the definition of the word a in the EDR Japanese word
dictionary (EDR 95a) to be the super-ordinate of the
word a.



Because of this rule, when a pronoun is "demon- 
strative adjective + noun phrase a" and there is the 
same noun phrase a near it, it is judged to be "gentei- 
reference" and is selected as a candidate of the refer- 
ent. When there is a subordinate of a noun phrase a 
near it, it is also selected as a candidate of the referent. 
These rules give higher points to a candidate referent 
than other rules do. The following is an example of the 
"demonstrative adjective + noun phrase a" referring to 



This rule is for checking the semantic constraint. (For a
daikou-reference, candidates of the referent are selected
by Candidate enumerating rule 1 in Section 3.1.)



We explain how to use the rule for the underlined "sono
(the)" in the sentences (6). First, the system gathers
examples of the form "Noun X no kuchi (mouth of Noun
X)". Table 7 shows some examples of "Noun X no kuchi
(mouth of Noun X)" in the Japanese Co-occurrence
Dictionary (EDR 95c). Next, the system checks the semantic
similarity between candidate referents and Noun X,
and judges the candidate referent having a higher
similarity to be a better candidate referent. In this
example, "tengu" is semantically similar to Noun X in that
they are both living things. Finally, the system selects
"tengu" as the proper referent.

Rules when non-so-series demonstrative has
daikou-reference
Candidate judging rule 6

When a pronoun is a non-so-series demonstrative adjective,
the system consults examples of the form "Noun
X no (of) Noun Y (Y of X)" whose Noun Y is modified
by the pronoun, and gives candidate referents the
points in Table 8 according to the similarity between
the candidate referent and noun X in "Bunrui Goi
Hyou" (NLRI 64). Since a non-so-series demonstrative
adjective rarely is a daikou-reference (NLRI 81)
(Yamamura et al. 92), the number of points is smaller
than in the case of the so-series.

Table 8: Points given in the case of non-so-series
demonstrative adjective

Sim.      0    1    2    3    4    5    6   Exact
Points  -30  -30  -30  -30  -10   -5   -2      0

Table 9: Results of investigating whether "kon'na
noun" (noun like this) refers to the previous or next
sentences

Postpositional particle   previous sentence   next sentence
wa (topic)                        9                  0
wa-nai                            5                  0
ni (indirect object)             17                  0
ni-mo                             1                  0
ni-wa                             2                  0
de (place)                       15                  0
de-wa                             5                  0
no (possessive)                   9                  0
sura                              2                  0
ga (subject)                     27                 22
wo (object)                      43                 26
mo (also)                         2                  4
de-wa-nai                         0                  1
Total                           137                 53

Rule when a pronoun refers to a verb phrase 

Like a demonstrative pronoun, a demonstrative adjective
can refer to the meaning of the verb phrase in the
previous sentence. This case is resolved by Candidate
enumerating rule 2 in Section 3.1.

Rule for "kon'na noun" (noun like this) 

"kon'na noun" can also refer to the next sentences in 
addition to a noun phrase and the previous sentences. 

ojiisan-wa odorinagara kon'na uta-wo utaimashita.
(old man) (dance) (song like this) (sing) 
(As he danced, he sang the following song: ) 

"tengu tengu hachi tengu. 
(tengu) (tengu) (eight tengu) 
("'Tengu,' 'tengu,' eight 'tengus.'") 

(8) 

In the above example, "kon'na uta (song like this)" refers 
to the next sentence "tengu, tengu, hachi tengu." 

But we cannot decide whether "kon'na + noun" (noun
like this) refers to the previous or next sentences only by
the expression of "kon'na + noun" (noun like this) itself.
To make the decision, we gathered 317 sentences
containing "kon'na" (like this) from about 60,000
sentences in Japanese essays and editorials, and counted
the total frequency of cases in which "kon'na" refers to
the previous and next sentences. The results are shown
in Table 9. This table indicates that "kon'na + noun"
followed by particles other than "ga" and "wo," which
are used when representing new information, very
often refers to the previous sentence. Therefore, in these
cases the system judges that the desired antecedent is the
previous sentence. When "kon'na noun" is followed by the
particles "ga" or "wo," the proper referent is determined by
the expression in quotation marks (",").
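
The tendency read off Table 9 can be checked with a few lines of arithmetic. The sketch below copies the counts from Table 9, reading empty cells as zero (an assumption that is consistent with the column totals of 137 and 53); the dictionary and function names are illustrative only.

```python
# particle -> (refers to previous sentence, refers to next sentence)
TABLE_9 = {
    "wa": (9, 0), "wa-nai": (5, 0), "ni": (17, 0), "ni-mo": (1, 0),
    "ni-wa": (2, 0), "de": (15, 0), "de-wa": (5, 0), "no": (9, 0),
    "sura": (2, 0), "ga": (27, 22), "wo": (43, 26), "mo": (2, 4),
    "de-wa-nai": (0, 1),
}


def previous_ratio(particle):
    """Fraction of cases in which "kon'na + noun + particle" refers backwards."""
    previous, following = TABLE_9[particle]
    return previous / (previous + following)


for particle in ("wa", "ni", "ga", "wo"):
    print(particle, round(previous_ratio(particle), 2))
```

Particles such as "wa" and "ni" refer backwards in every counted case, while "ga" and "wo" are split between the two directions, which is why the decision for those two particles needs the additional cue.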

3.3 Rule for Demonstrative Adverbs 

Rule when so-series demonstrative adverb 
refers to the previous sentences 

Candidate enumerating rule 9 

When an anaphor is a so-series demonstrative adverb 

such as "sou (so)," 

{(the previous sentences, 30)} 

The following is an example. 

"tengu tengu hachi tengu." 
(tengu) (tengu) (eight tengu) 
("'Tengu,' 'tengu,' eight 'tengus.'") 

sou utatta-nowa sokoni hachihiki-no tengu-ga itakara-desu. 
(sing so) (there) (eight) (tengu) (exist) 

(He sang so because he counted eight of them there. ) 

(9) 

"sou (so)" refers to the previous sentence "tengu tengu 
hachi tengu". 

Rule when so-series demonstrative adverb
cataphorically refers to the verb phrase
in the same sentence

Candidate enumerating rule 10 

When an anaphor is "sou/ soushite/ sonoyouni" and is 
in the subordinate clause which has a conjunctive par- 
ticle such as "ga", "daga", and "keredo" or an adjective 
conjunction such as "youni", 
{(the main clause, 45)} 



4 Heuristic Rule for Personal Pronouns 

Candidate enumerating rule 1 

When an anaphor is a first personal pronoun, 
{(the first person (the speaker) in the context, 25)} 

Candidate enumerating rule 2 

When an anaphor is a second personal pronoun, 
{(the second person (the hearer) in the context, 25)} 



ojiisan-wa jimen-ni koshi-wo-oroshimashita.
(old man) (ground) (sit down)
(The old man sat down on the ground.)

yagate (ojiisan-wa) nemutte-shimaimashita.
(soon) (old man) (fall asleep)
(He soon fell asleep.)

Semantic Marker
HUM/ANI ga (agent) nemuru (sleep)

Example
kare (he)/inu (dog) ga (agent) nemuru (sleep)

Figure 3: How to check semantic constraint

Table 10: Points given by a verb-noun relationship

Sim.      0    1    2    3    4    5    6   Exact
Points  -10   -2    1    2  2.5    3  3.5      4

A first or second personal pronoun is often presented
in quotations, and can be resolved by estimating the
first person (speaker) or the second person (hearer) in
advance. The estimation of the first person and the
second person is performed by regarding the ga-case
(subjective) and ni-case (objective) components of the verb
phrase representing the speaking action of the quotation
as the first and second persons, respectively. The
detection of the verb phrase representing the speaking action
is performed as follows. If the quotation is followed by a
speaking action verb phrase such as "to itta (said),"
that verb phrase is regarded as the verb phrase
representing the speaking action. Otherwise, the last verb phrase
in the previous sentence is regarded as the verb phrase
representing the speaking action. For example, the
second personal pronoun "omaesan (you)" in the following
sentences refers to the second person "ojiisan (the old
man)" in this quotation.



"asu, mata mairimasuyo" to
(tomorrow) (again) (come)
("I'll come again tomorrow,")

ojiisan-wa yakusoku-shimashita.
(old man) (promise)
(promised the old man.)

"mochiron omaesan-wo utagauwakedewanainodaga"
(of course) (you) (don't mean to doubt)
("Of course, we don't mean to doubt you,")

tengu-ga ojiisan-ni iimashita.
(tengu) (old man) (said)
(said one of the "tengu" to the old man.)

(10)

The second person in the quotation is estimated to be
"ojiisan" because the ni-case component of the verb
phrase "iimashita (said)" representing the speaking
action of the quotation is "ojiisan".
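
The estimation of speaker and hearer described above can be sketched as a small function. The dictionary layout for a verb phrase (keys "speaking", "ga", "ni") and the function name are hypothetical, chosen only to make the procedure concrete; a real case structure would carry more information.

```python
from typing import Optional, Tuple


def estimate_persons(following_vp: Optional[dict],
                     previous_vp: Optional[dict]) -> Tuple[Optional[str], Optional[str]]:
    """Return (first person, second person) for a quotation.

    following_vp: the verb phrase right after the quotation, if any
    previous_vp:  the last verb phrase of the previous sentence, if any
    Each verb phrase is a dict such as
    {"speaking": True, "ga": "tengu", "ni": "ojiisan"}  (hypothetical layout).
    """
    # Prefer a speaking-action verb phrase that follows the quotation.
    if following_vp is not None and following_vp.get("speaking"):
        vp = following_vp
    else:
        # Otherwise fall back to the last verb phrase of the previous sentence.
        vp = previous_vp
    if vp is None:
        return (None, None)
    # The ga-case fills the speaker, the ni-case fills the hearer.
    return (vp.get("ga"), vp.get("ni"))


# Example (10): the quotation is followed by "tengu-ga ojiisan-ni iimashita".
said = {"speaking": True, "ga": "tengu", "ni": "ojiisan"}
print(estimate_persons(said, None))
```

A second personal pronoun inside the quotation would then be resolved to the second element of the returned pair.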

Candidate enumerating rule 3 

When an anaphor is a third personal pronoun, 
{(a first person, —10) (a second person, —10)} 



5 Heuristic Rule for Zero Pronoun 

Rule proposing candidate referents of general 
zero pronoun 

Candidate enumerating rule 1 

When a zero pronoun is a ga-case component,
{(A topic which has weight W and distance D,
W - D * 2 + 1)
(A focus which has weight W and distance D,
W - D + 1)
(A subject of a clause coordinately connected to the
clause containing the anaphor, 25)
(A subject of a clause subordinately connected to the
clause containing the anaphor, 23)
(A subject of a main clause whose embedded clause
contains the anaphor, 22)}

Candidate enumerating rule 2

When a zero pronoun is not a ga-case component,
{(A topic which has weight W and distance D,
W - D * 2 - 3)
(A focus which has weight W and distance D,
W - D * 2 + 1)}



Rule using semantic relation to verb phrase 

Candidate judging rule 1 

When a candidate referent of a case component (a zero 
pronoun) does not satisfy the semantic marker of the 
case component in the case frame, it is given —5. 



Candidate judging rule 2 

A candidate referent of a case component (a zero pronoun)
is given the points in Table 10 by using the highest
semantic similarity between the candidate referent
and examples of the case component in the case frame.

These two rules are for checking the semantic con- 
straint between the candidate referent and the verb 
phrase which has the candidate referent in its case com- 
ponent. Candidate judging rule 1 checks semantic con- 
straints by using semantic markers. Candidate judging 
rule 2 checks semantic constraints by using examples. 
Figure 3 explains how to check semantic constraints in
the example sentences.

In the method using semantic markers, a candidate
referent is the proper referent if one of the semantic
markers belonging to the candidate referent is equal or
subordinate to the semantic marker of the case component.
For example, with respect to the zero pronoun in
Figure 3, since the ga-case component in the verb "nemuru
(sleep)" has the semantic markers HUM (human
being) and ANI (animal) and since "ojiisan (old man)"
has the semantic marker HUM, the proper referent is
judged to be "ojiisan."

In the example-based method, the validity of a candidate
referent is decided by the semantic similarity between
the candidate referent and the examples of the
case component in the verb case frame. The higher the
semantic similarity is, the greater the validity is. For
example, with respect to a zero pronoun in Figure 3,
since the examples of the ga-case are "kare (he)" and
"inu (dog)," and since "ojiisan (old man)" is semantically
similar to "kare (he)", the proper referent is "ojiisan
(old man)."
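
The two judging rules can be put side by side in a short sketch. The points tuple copies Table 10, assuming its first column is similarity level 0. The marker check is simplified to plain equality (the "subordinate to" relation is omitted), and the similarity levels in the usage lines are made-up values, since the actual BGH codes of these words are not given here.

```python
# Table 10: points for similarity levels 0..6 and, last, an exact match.
TABLE_10_POINTS = (-10, -2, 1, 2, 2.5, 3, 3.5, 4)


def marker_points(candidate_markers, required_markers):
    """Candidate judging rule 1: -5 if no semantic marker is satisfied.

    Simplification: markers must be equal; marker subordination is ignored.
    """
    return 0 if set(candidate_markers) & set(required_markers) else -5


def example_points(candidate, examples, similarity):
    """Candidate judging rule 2: points for the best match among examples.

    similarity: callable (candidate, example) -> level in 0..7
                (a hypothetical interface for this sketch)
    """
    best = max(similarity(candidate, example) for example in examples)
    return TABLE_10_POINTS[best]


# Hypothetical similarity levels for the ga-case of "nemuru (sleep)".
ASSUMED_LEVELS = {("ojiisan", "kare"): 5, ("ojiisan", "inu"): 2}


def toy_similarity(candidate, example):
    return ASSUMED_LEVELS.get((candidate, example), 0)


print(marker_points({"HUM"}, {"HUM", "ANI"}))
print(example_points("ojiisan", ["kare", "inu"], toy_similarity))
```

Both functions return an adjustment that is added to the score a candidate received from the enumerating rules.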

These rules, which use semantic relationships to verbs, 
are also used in the estimation of the referent of demon- 
stratives and personal pronouns. 

Rule using the feature that it is difficult for 
a noun phrase to be filled in multiple case 
components of the same verb 

Candidate enumerating rule 4 

When there is "Noun X" in another case component of 
the verb which has the analyzed case component (the 
analyzed zero pronoun), {(Noun X, —20)} 

Rule using empathy 

This rule is based on empathy theory (Kameyama 86).
When an anaphor is a ga-case zero pronoun whose verb
is followed by an auxiliary verb such as "kureru" or
"kudasaru", the ni-case zero pronoun is analyzed first, and
it is filled with the noun phrase that has high empathy
such as the topic, and a ga-case zero pronoun is filled
with another noun phrase.

doru souba-wa kitai-kara 130-yen-dai-ni joushoushita.
(dollar) (the expectations) (130 yen) (surge)
(The dollar has since rebounded to about 130 yen because of the expectations.)

kono doru-daka-wa oushuu-tono kankei-wo gikushaku-saseteiru.
(the dollar's surge) (Europe) (relation) (strain)
(The dollar's surge is straining relations with Europe.)

                                 Score of each candidate (points)
Rule                             the previous   new          130 yen     kitai            dorusouba
                                 sentence       individual   (130 yen)   (expectations)   (dollar)
Candidate enumerating rule 2     15
Candidate enumerating rule 5                    10
Candidate enumerating rule 1                                 17          15               15
Candidate judging rule 6                                     -30         -30              -30
Total score                      15             10           -13         -15              -15

Figure 4: Example of resolving demonstrative "kono (this)"

Table 11: Results

Text       demonstrative   personal pronoun   zero pronoun     total
Training   87% (41/47)     100% (9/9)         86% (177/205)    87% (227/261)
Test       86% (42/49)     82% (9/11)         76% (159/208)    78% (210/268)

The points given in each rule are manually adjusted by using the training sentences.

Training sentences {example sentences (43 sentences), a folk tale "kobutori jiisan" (Nakao 85) (93 sentences), an essay in
"tenseijingo" (26 sentences), an editorial (26 sentences), an article in "Scientific American (in Japanese)" (16 sentences)}
Test sentences {a folk tale "tsuru no ongaeshi" (Nakao 85) (91 sentences), two essays in "tenseijingo" (50 sentences), an
editorial (30 sentences), articles in "Scientific American (in Japanese)" (13 sentences)}

6 Experiment and Discussion 
6.1 Experiment 

Before pronoun resolution, sentences were transformed
into a case structure by a case structure analyzer
(Kurohashi & Nagao 94). The errors made by the structure
analyzer were corrected by hand. We used the IPAL dictionary
(IPAL 87) as a verb case frame dictionary. We put
together the case frames of the verb phrases which were
not contained in this dictionary by consulting a large
amount of linguistic data.

An example of resolving the demonstrative "kono
(this)" is shown in Figure 4, which shows that the referent
of the noun phrase "kono dorudaka (this dollar's
surge)" was properly judged to be the previous sentence.

By Candidate enumerating rule 2 in Section ^, the system
took a candidate "the previous sentence" and gave
it 15 points. By Candidate enumerating rule 5 in
Section pi, the system took a candidate "new individual"
and gave it 10 points. By Candidate enumerating rule 1
in Section ^, the system took three candidates, "130
yen (130 yen)", "kitai (expectations)", and "dorusouba
(dollar)", and gave them 17, 15, and 15 points,
respectively. The system applied Candidate judging rule 6 to
them. This rule uses examples of the form "X no Y". In this
case, it used examples of "X no dorudaka (the dollar's surge
of X)". The only example noun phrase X of the form
"X no dorudaka" in the EDR occurrence dictionary was
"saikin (recently)". All three candidates, "130 yen (130
yen)", "kitai (expectations)", and "dorusouba (dollar)",
were low in similarity to "saikin (recently)" in "Bun Rui
Goihyou", and were given -30 points by Table ^. The other
two candidates, "the previous sentence" and "new
individual", are not noun phrases, so they were not given
points by Candidate judging rule 6. As a result, "the
previous sentence" had the highest score and was judged to
be the proper referent.
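
The scoring in this example can be traced with a short sketch. It is only an illustration, not the system's implementation: the point values are those reported above, while the names `enumeration_points`, `noun_phrase_candidates`, and `total_scores` are hypothetical, and it assumes that points from different rules are combined by simple addition.

```python
# Points assigned by the candidate enumerating rules (values from the text).
enumeration_points = {
    "the previous sentence": 15,  # Candidate enumerating rule 2
    "new individual": 10,         # Candidate enumerating rule 5
    "130 yen": 17,                # Candidate enumerating rule 1
    "kitai": 15,                  # Candidate enumerating rule 1
    "dorusouba": 15,              # Candidate enumerating rule 1
}

# Only noun-phrase candidates are scored by Candidate judging rule 6.
noun_phrase_candidates = {"130 yen", "kitai", "dorusouba"}

# Points given for low similarity to the example "saikin" (from the text).
LOW_SIMILARITY_POINTS = -30


def total_scores(points, noun_phrases, similarity_points):
    """Add the judging-rule points to each noun-phrase candidate.

    Assumption: points from different rules are combined by addition.
    """
    return {
        candidate: base + (similarity_points if candidate in noun_phrases else 0)
        for candidate, base in points.items()
    }


totals = total_scores(
    enumeration_points, noun_phrase_candidates, LOW_SIMILARITY_POINTS
)
referent = max(totals, key=totals.get)
print(totals)
print(referent)
```

Under the additive assumption, the three noun-phrase candidates end with negative totals, so "the previous sentence" keeps the highest score, in agreement with the judgment described above.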

We show the results of our resolution of demonstratives,
personal pronouns, and zero pronouns in Table 11.
The detailed results for demonstratives are shown in
Table 12. The precision rate for zero pronouns is for the case
in which the system knows in advance whether the zero
pronoun has a referent.

6.2 Discussion 

With respect to demonstratives, the precision rate was 
over 80% even in the test sentences. This indicates that 
the rules used in this system are effective. But since 
Japanese demonstratives are classified into many kinds, 
the precision may be increased by making more detailed 
rules. In this work we used the feature that "kono (this)" 
rarely functions as a daikou-reference. There were four
cases analyzed correctly because of this rule. 

With respect to personal pronouns, since only first-
and second-person pronouns appeared in the texts used
in the experiment, almost all of the personal pronouns
were resolved correctly by estimating the first and second
persons in the quotation. The main causes of the errors
in personal pronoun resolution were that the ni-case
zero pronoun was resolved incorrectly and that the second
person was estimated incorrectly.



Table 12: Detailed results for demonstratives

Text      demonstrative  demonstrative  demonstrative  total score
          pronoun        adjective      adverb
Training  83% (15/18)    86% (19/22)    100% (7/7)     87% (41/47)
Test      82% (14/17)    88% (23/26)    83% (5/6)      86% (42/49)



Table 13: Results of comparison between the semantic marker method and the example-based method

                  Text      Method 1       Method 2       Method 3       Method 4       Method 5
Demonstrative     Training  87% (41/47)    83% (39/47)    87% (41/47)    83% (39/47)    79% (37/47)
                  Test      86% (42/49)    88% (43/49)    88% (43/49)    84% (41/49)    86% (42/49)
Personal pronoun  Training  100% (9/9)     100% (9/9)     100% (9/9)     100% (9/9)     89% (8/9)
                  Test      82% (9/11)     64% (7/11)     82% (9/11)     55% (6/11)     64% (7/11)
Zero pronoun      Training  86% (177/205)  83% (171/205)  86% (176/205)  82% (169/205)  66% (135/205)
                  Test      76% (159/208)  76% (158/208)  79% (164/208)  75% (155/208)  63% (131/208)

Method 1 : Using both semantic marker and example
Method 2 : Using semantic marker
Method 3 : Using example (using modified codes of bunrui goi hyou)
Method 4 : Using example (using original codes of bunrui goi hyou)
Method 5 : Using neither semantic marker nor example



Several factors account for the errors in zero pronoun
resolution: errors in the Japanese thesaurus
"Bunrui goi hyou", in the Noun Semantic Marker Dictionary,
and in the Case Frame Dictionary.

6.3 Comparison Experiment 

As mentioned before, we use both the example rule and
the semantic marker rule as judging rules. To check
which rule is more effective, we compared the example
method with the semantic marker method. The results are
shown in Table 13. For each pronoun type, the upper and
lower rows of this table show the accuracy rates
for training and test sentences, respectively. The
precision of the method using examples was equivalent or
superior to that of the method using semantic markers.
This indicates that examples can be used as well as
semantic markers. Since some codes in BGH are incorrect,
we modified them. The precision using the modified codes
was higher than that using the original codes, which
indicates that the code modification is valid.
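
The comparison between the semantic marker method (Method 2) and the example method with modified codes (Method 3) can be checked directly from the counts in Table 13. The following is a minimal sketch: the counts are copied from the table, while the dictionary layout and the names `table13`, `precision`, and `example_not_worse` are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

# (correct, total) counts copied from Table 13.
# "marker" = Method 2, "example" = Method 3 (modified codes).
table13 = {
    ("Demonstrative", "training"): {"marker": (39, 47), "example": (41, 47)},
    ("Demonstrative", "test"): {"marker": (43, 49), "example": (43, 49)},
    ("Personal pronoun", "training"): {"marker": (9, 9), "example": (9, 9)},
    ("Personal pronoun", "test"): {"marker": (7, 11), "example": (9, 11)},
    ("Zero pronoun", "training"): {"marker": (171, 205), "example": (176, 205)},
    ("Zero pronoun", "test"): {"marker": (158, 208), "example": (164, 208)},
}


def precision(correct, total):
    """Exact precision as a fraction, avoiding floating-point ties."""
    return Fraction(correct, total)


def example_not_worse(results):
    """True if the example method matches or beats the marker method in every row."""
    return all(
        precision(*row["example"]) >= precision(*row["marker"])
        for row in results.values()
    )


for (kind, split), row in table13.items():
    marker = float(precision(*row["marker"]))
    example = float(precision(*row["example"]))
    print(f"{kind:17s} {split:9s} marker={marker:.1%} example={example:.1%}")

print(example_not_worse(table13))
```

Exact fractions are used so that rows with identical counts compare as equal rather than depending on rounding.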

7 Summary 

In this paper, we presented a method of estimating refer- 
ents of demonstrative pronouns, personal pronouns, and 
zero pronouns in Japanese sentences using examples,
surface expressions, topics and foci. Unlike conventional
work, which uses semantic markers for semantic
constraints, we used examples for semantic constraints and
showed in our experiments that examples are as useful
as semantic markers. We also proposed many new
methods for estimating referents of pronouns. For example,
we used the form "X of Y" for estimating referents of 
demonstrative adjectives. In addition to our new meth- 
ods, we used many conventional methods. As a result, 
experiments using these methods obtained a precision 
rate of 87% in estimating referents of demonstrative pro- 
nouns, personal pronouns, and zero pronouns for training 
sentences, and obtained a precision rate of 78% for test 
sentences. 
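
As a rough arithmetic check, the overall rates can be compared with the Method 1 counts in Table 13. The sketch below assumes that the overall precision is obtained by pooling the correct and total counts of the three pronoun types; the text does not state how the rates are combined, so this pooling and the names `method1_counts` and `pooled_precision` are assumptions made for illustration.

```python
# (correct, total) counts for Method 1, copied from Table 13.
method1_counts = {
    "training": {
        "Demonstrative": (41, 47),
        "Personal pronoun": (9, 9),
        "Zero pronoun": (177, 205),
    },
    "test": {
        "Demonstrative": (42, 49),
        "Personal pronoun": (9, 11),
        "Zero pronoun": (159, 208),
    },
}


def pooled_precision(counts):
    """Pool counts over pronoun types (an assumed way of combining them)."""
    correct = sum(c for c, _ in counts.values())
    total = sum(t for _, t in counts.values())
    return correct / total


training_rate = pooled_precision(method1_counts["training"])
test_rate = pooled_precision(method1_counts["test"])
print(f"training: {training_rate:.1%}")
print(f"test: {test_rate:.1%}")
```

Under this pooling assumption the counts are 227/261 for training sentences and 210/268 for test sentences, which round to the 87% and 78% stated above.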